139 research outputs found
One-dimensional layout optimization, with applications to graph drawing by axis separation
Abstract: In this paper we discuss a useful family of graph drawing algorithms, characterized by their ability to draw graphs in one dimension. We define the special requirements of such algorithms and show how several graph drawing techniques can be extended to handle this task. In particular, we suggest a novel optimization algorithm that facilitates using the Kamada and Kawai model [Inform. Process. Lett. 31 (1989) 7–15] for producing one-dimensional layouts. The most important application of the algorithms seems to be in achieving graph drawing by axis separation, where each axis of the drawing addresses different aspects of aesthetics.
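As a concrete illustration of the kind of objective the abstract refers to, the following is a minimal sketch of minimising a Kamada-Kawai-style stress function over one-dimensional coordinates. The function names and the plain gradient-descent optimiser are illustrative assumptions, not the paper's specialised algorithm:

```python
import numpy as np

def kk_stress_1d(x, d):
    """Kamada-Kawai-style stress for a 1-D layout:
    sum over pairs of (|x_i - x_j| - d_ij)^2 / d_ij^2."""
    n = len(x)
    s = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            s += (abs(x[i] - x[j]) - d[i][j]) ** 2 / d[i][j] ** 2
    return s

def optimize_1d(d, iters=500, lr=0.01, seed=0):
    """Minimise the 1-D stress by simple gradient descent (a stand-in
    for the paper's specialised optimiser, not a reproduction of it)."""
    rng = np.random.default_rng(seed)
    n = len(d)
    x = rng.standard_normal(n)
    for _ in range(iters):
        grad = np.zeros(n)
        for i in range(n):
            for j in range(n):
                if i == j:
                    continue
                diff = x[i] - x[j]
                grad[i] += 2 * (abs(diff) - d[i][j]) * np.sign(diff) / d[i][j] ** 2
        x -= lr * grad
    return x

# Path graph a-b-c, using graph-theoretic distances as the target d_ij
d = [[0, 1, 2], [1, 0, 1], [2, 1, 0]]
x = optimize_1d(d)
order = np.argsort(x)  # a 1-D layout induces an ordering of the vertices
```

Note that 1-D stress minimisation has many local minima, which is precisely why a specialised optimiser is needed in the full method.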
Workshop on information heterogeneity and fusion in recommender systems (HetRec 2010)
This is the author's version of the work. It is posted here for your personal use. Not for redistribution. The definitive Version of Record was published in RecSys '10: Proceedings of the fourth ACM conference on Recommender systems, http://dx.doi.org/10.1145/1864708.1864796
Knowledge-aware Complementary Product Representation Learning
Learning product representations that reflect complementary relationships plays a central role in e-commerce recommender systems. In the absence of the product relationship graph, which existing methods rely on, there is a need to detect the complementary relationships directly from noisy and sparse customer purchase activities. Furthermore, unlike simple relationships such as similarity, complementariness is asymmetric and non-transitive. Standard usage of representation learning emphasizes only one set of embeddings, which is problematic for modelling such properties of complementariness. We propose using knowledge-aware learning with dual product embeddings to solve the above challenges. We encode contextual knowledge into the product representation by multi-task learning, to alleviate the sparsity issue. By explicitly modelling with user bias terms, we separate the noise of customer-specific preferences from the complementariness. Furthermore, we adopt the dual embedding framework to capture the intrinsic properties of complementariness and provide a geometric interpretation motivated by the classic separating-hyperplane theory. Finally, we propose a Bayesian network structure that unifies all the components and subsumes several popular models as special cases. The proposed method compares favourably to state-of-the-art methods in downstream classification and recommendation tasks. We also develop an implementation that scales efficiently to a dataset with millions of items and customers.
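The asymmetry argument above can be made concrete with a toy sketch of dual embeddings. This is a generic illustration of the two-table idea, not the paper's full Bayesian model; all names and values below are hypothetical:

```python
import numpy as np

def complementary_score(U, V, i, j):
    """Score that item j complements item i, using two separate
    embedding tables: U holds "query" embeddings, V holds "target"
    embeddings. Because U[i] @ V[j] != U[j] @ V[i] in general, the
    relation is naturally asymmetric -- a single shared table would
    force the score to be symmetric."""
    return 1.0 / (1.0 + np.exp(-(U[i] @ V[j])))  # sigmoid of dot product

rng = np.random.default_rng(0)
n_items, dim = 5, 8
U = rng.standard_normal((n_items, dim))
V = rng.standard_normal((n_items, dim))

s_ij = complementary_score(U, V, 0, 1)  # e.g. "phone -> phone case"
s_ji = complementary_score(U, V, 1, 0)  # reverse direction scores differently
```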
Automatically tagging email by leveraging other users' folders
Most email applications devote a significant part of their real estate to organization mechanisms such as folders. Yet, we verified on the Yahoo! Mail service that 70% of email users have never defined a single folder. This implies that one of the most well-known email features is underexploited. We propose here to revive the feature by providing a method for generating a lighter form of folders, or tags, benefiting even the most passive users. The method automatically associates, whenever possible, an appropriate semantic tag with a given email. This gives rise to an alternate mechanism for organizing and searching email. We advocate a novel modeling approach that exploits the overall population of users, thereby learning from the wisdom of crowds how to categorize messages. Given our massive user base, it is enough to learn from a minority of the users who label certain messages in order to label that kind of message for the general population. We design a novel cascade classification approach, which copes with the severe scalability and accuracy constraints we are facing. Significant efficiency gains are achieved by working within a low-dimensional latent space, and by using a novel hierarchical classifier. The precision level is controlled by separating the task into a two-phase classification process. We performed an extensive empirical study covering three different time periods, over 100 million messages, and thousands of candidate tags per message. The results are encouraging and compare favorably with alternative approaches. Our method successfully tags 72% of incoming email traffic. Performance-wise, the computational overhead, even under surges of large traffic, is sufficiently low for our approach to be applicable in production on any large Web mail service.
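The two-phase idea the abstract describes (a cheap, high-recall candidate stage followed by a precision-controlled acceptance stage) can be sketched as follows. The nearest-centroid classifier and the threshold rule here are illustrative stand-ins, not the paper's actual hierarchical classifier:

```python
import numpy as np

def propose_tags(msg_vec, tag_centroids, k=3):
    """Phase 1 (recall): cheaply shortlist candidate tags by distance
    to per-tag centroids in a low-dimensional latent space."""
    dists = np.linalg.norm(tag_centroids - msg_vec, axis=1)
    return np.argsort(dists)[:k]

def accept_tag(msg_vec, candidates, tag_centroids, threshold=1.0):
    """Phase 2 (precision): keep the best candidate only when it clears
    a confidence threshold; otherwise emit no tag at all, trading
    coverage for precision."""
    best = min(candidates,
               key=lambda t: np.linalg.norm(tag_centroids[t] - msg_vec))
    if np.linalg.norm(tag_centroids[best] - msg_vec) < threshold:
        return best
    return None

# Toy latent space with three tag centroids
tag_centroids = np.array([[0.0, 0.0], [5.0, 5.0], [0.0, 5.0]])

msg = np.array([0.2, 0.1])  # clearly near tag 0: confidently tagged
tag = accept_tag(msg, propose_tags(msg, tag_centroids, k=2), tag_centroids)

far_msg = np.array([2.5, 2.5])  # equidistant from all: left untagged
no_tag = accept_tag(far_msg, propose_tags(far_msg, tag_centroids, k=2),
                    tag_centroids)
```

Declining to tag ambiguous messages is what lets the precision level be controlled while still covering a large share of traffic.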
Efficacy and safety of more intensive lowering of LDL cholesterol: a meta-analysis of data from 170 000 participants in 26 randomised trials
Background: Lowering of LDL cholesterol with standard statin regimens reduces the risk of occlusive vascular events in a wide range of individuals. We aimed to assess the safety and efficacy of more intensive lowering of LDL cholesterol with statin therapy. Methods: We undertook meta-analyses of individual participant data from randomised trials involving at least 1000 participants and at least 2 years' treatment duration of more versus less intensive statin regimens (five trials; 39 612 individuals; median follow-up 5·1 years) and of statin versus control (21 trials; 129 526 individuals; median follow-up 4·8 years). For each type of trial, we calculated not only the average risk reduction, but also the average risk reduction per 1·0 mmol/L LDL cholesterol reduction at 1 year after randomisation. Findings: In the trials of more versus less intensive statin therapy, the weighted mean further reduction in LDL cholesterol at 1 year was 0·51 mmol/L. Compared with less intensive regimens, more intensive regimens produced a highly significant 15% (95% CI 11–18; p<0·0001) further reduction in major vascular events, consisting of separately significant reductions in coronary death or non-fatal myocardial infarction of 13% (95% CI 7–19; p<0·0001), in coronary revascularisation of 19% (95% CI 15–24; p<0·0001), and in ischaemic stroke of 16% (95% CI 5–26; p=0·005). Per 1·0 mmol/L reduction in LDL cholesterol, these further reductions in risk were similar to the proportional reductions in the trials of statin versus control. When both types of trial were combined, similar proportional reductions in major vascular events per 1·0 mmol/L LDL cholesterol reduction were found in all types of patient studied (rate ratio [RR] 0·78, 95% CI 0·76–0·80; p<0·0001), including those with LDL cholesterol lower than 2 mmol/L on the less intensive or control regimen. 
Across all 26 trials, all-cause mortality was reduced by 10% per 1·0 mmol/L LDL reduction (RR 0·90, 95% CI 0·87–0·93; p<0·0001), largely reflecting significant reductions in deaths due to coronary heart disease (RR 0·80, 99% CI 0·74–0·87; p<0·0001) and other cardiac causes (RR 0·89, 99% CI 0·81–0·98; p=0·002), with no significant effect on deaths due to stroke (RR 0·96, 95% CI 0·84–1·09; p=0·5) or other vascular causes (RR 0·98, 99% CI 0·81–1·18; p=0·8). No significant effects were observed on deaths due to cancer or other non-vascular causes (RR 0·97, 95% CI 0·92–1·03; p=0·3) or on cancer incidence (RR 1·00, 95% CI 0·96–1·04; p=0·9), even at low LDL cholesterol concentrations. Interpretation: Further reductions in LDL cholesterol safely produce definite further reductions in the incidence of heart attack, of revascularisation, and of ischaemic stroke, with each 1·0 mmol/L reduction reducing the annual rate of these major vascular events by just over a fifth. There was no evidence of any threshold within the cholesterol range studied, suggesting that reduction of LDL cholesterol by 2–3 mmol/L would reduce risk by about 40–50%.
BLOB : A Probabilistic Model for Recommendation that Combines Organic and Bandit Signals
A common task for recommender systems is to build a profile of the interests of a user from items in their browsing history and later to recommend items to the user from the same catalog. The users' behavior consists of two parts: the sequence of items that they viewed without intervention (the organic part) and the sequences of items recommended to them and their outcome (the bandit part). In this paper, we propose the Bayesian Latent Organic Bandit model (BLOB), a probabilistic approach to combine the 'organic' and 'bandit' signals in order to improve the estimation of recommendation quality. The bandit signal is valuable as it gives direct feedback on recommendation performance, but the signal quality is very uneven, as it is highly concentrated on the recommendations deemed optimal by the past version of the recommender system. In contrast, the organic signal is typically strong and covers most items, but is not always relevant to the recommendation task. In order to leverage the organic signal to efficiently learn the bandit signal in a Bayesian model, we identify three fundamental types of distances, namely action-history, action-action and history-history distances. We implement a scalable approximation of the full model using variational auto-encoders and the local re-parameterization trick. We show using extensive simulation studies that our method outperforms or matches the value of both state-of-the-art organic-based recommendation algorithms and bandit-based methods (both value- and policy-based), in both organic-rich and bandit-rich environments.
Comment: 26th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, Aug 2020, San Diego, United States
Discrete deep learning for fast content-aware recommendation
The cold-start problem and recommendation efficiency have been regarded as two crucial challenges in recommender systems. In this paper, we propose a hashing-based deep learning framework called Discrete Deep Learning (DDL), to map users and items to Hamming space, where a user's preference for an item can be efficiently calculated by Hamming distance; this computation scheme significantly improves the efficiency of online recommendation. Besides, DDL unifies the user-item interaction information and the item content information to overcome the issues of data sparsity and cold-start. To be more specific, to integrate content information into our DDL framework, a deep learning model, the Deep Belief Network (DBN), is applied to extract effective item representations from the item content information. Moreover, the framework imposes balance and irrelevance constraints on the binary codes to derive compact but informative codes. Due to the discrete constraints in DDL, we propose an efficient alternating optimization method consisting of iteratively solving a series of mixed-integer programming subproblems. Extensive experiments have been conducted to evaluate the performance of our DDL framework on two different Amazon datasets, and the experimental results demonstrate the superiority of DDL over the state-of-the-art methods regarding online recommendation efficiency and cold-start recommendation accuracy.
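The efficiency claim rests on how cheap Hamming-distance scoring is once users and items are binary codes: an XOR plus a popcount per pair. The following sketch shows the general hashing-based scoring scheme; the matching-bits preference score is an illustrative convention, not DDL's exact objective:

```python
def hamming_distance(a: int, b: int) -> int:
    """Hamming distance between two binary codes stored as ints:
    XOR marks the differing bits, popcount counts them."""
    return bin(a ^ b).count("1")

def preference(user_code: int, item_code: int, n_bits: int) -> float:
    """Preference as the fraction of matching bits: 1.0 means identical
    codes, 0.0 means every bit differs."""
    return 1.0 - hamming_distance(user_code, item_code) / n_bits

# Toy 8-bit codes
user = 0b10110010
item_close = 0b10110011  # differs from user in 1 bit
item_far = 0b01001101    # differs from user in all 8 bits
```

Because the score is a couple of bitwise instructions, ranking millions of items online reduces to integer operations rather than floating-point dot products.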
Trust and Reputation Modelling for Tourism Recommendations Supported by Crowdsourcing
Tourism crowdsourcing platforms have a profound influence on tourist behaviour, particularly in terms of travel planning. Not only do they hold the opinions shared by other tourists concerning tourism resources, but, with the help of recommendation engines, they are the pillar of personalised resource recommendation. However, since prospective tourists are unaware of the trustworthiness or reputation of crowd publishers, they are in fact taking a leap of faith when they rely on the crowd wisdom. In this paper, we argue that modelling publisher Trust & Reputation improves the quality of the tourism recommendations supported by crowdsourced information. Therefore, we present a tourism recommendation system which integrates: (i) user profiling using the multi-criteria ratings; (ii) k-Nearest Neighbours (k-NN) prediction of the user ratings; (iii) Trust & Reputation modelling; and (iv) incremental model update, i.e., providing near real-time recommendations. In terms of contributions, this paper provides two different Trust & Reputation approaches: (i) general reputation employing the pairwise trust values of all users; and (ii) neighbour-based reputation employing the pairwise trust values of the common neighbours. The proposed method was evaluated using crowdsourced datasets from the Expedia and TripAdvisor platforms.
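The two reputation variants the contribution list distinguishes can be sketched as simple aggregations over a pairwise trust matrix. The averaging scheme and the toy values below are illustrative assumptions; the paper's actual trust model is not reproduced here:

```python
import numpy as np

def general_reputation(trust, u):
    """General reputation of publisher u: mean pairwise trust assigned
    to u by every other user in the system."""
    n = trust.shape[0]
    others = [v for v in range(n) if v != u]
    return float(np.mean([trust[v, u] for v in others]))

def neighbour_reputation(trust, u, neighbours):
    """Neighbour-based reputation: average trust in publisher u, taken
    only over a target user's neighbours (e.g. from the k-NN step)."""
    common = [v for v in neighbours if v != u]
    if not common:
        return 0.0
    return float(np.mean([trust[v, u] for v in common]))

# trust[v, u] in [0, 1]: how much user v trusts publisher u (toy values)
trust = np.array([
    [1.0, 0.9, 0.2],
    [0.8, 1.0, 0.1],
    [0.3, 0.4, 1.0],
])
rep_global = general_reputation(trust, 0)        # (0.8 + 0.3) / 2
rep_local = neighbour_reputation(trust, 0, [1])  # only neighbour 1's trust
```

The neighbour-based variant lets the same publisher carry different reputations for different target users, which is what personalises the trust signal.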